KEY ISSUES IN THE ANALYSIS OF REMOTE SENSING DATA


A Report on the Workshop 
June 22-23, 1981 
Purdue University 
West Lafayette, Indiana 


Philip H. Swain 


INTRODUCTION 

The year 1981 found the remote sensing community assessing the
results of completed applications-oriented tests of the remote sensing
technology and looking ahead with great anticipation to new
opportunities for advancing the technology and broadening its use. For
example, the Large Area Crop Inventory Experiment (LACIE)[1] and the
development of a Forest Resources Information System (FRIS) for
commercial application [2] had demonstrated the capabilities and
limitations of the mid-1970's technology. The future availability of new
sensors, including the Thematic Mapper and the French SPOT multispectral
sensor, plus the anticipation of renewed research support from NASA
through a new fundamental research program, provided motivation for
understanding clearly both the current status of the technology and the
directions which future research must take to best utilize remote
sensing.

These considerations stimulated the convening of a Workshop on Key
Issues in the Analysis of Remote Sensing Data at Purdue University, June
22-23, 1981, in conjunction with the 1981 Symposium on Machine
Processing of Remotely Sensed Data. Jointly sponsored by Purdue's
Laboratory for Applications of Remote Sensing (LARS) and NASA, the
workshop had the following objectives:

1. To assemble experts in remote sensing and related information-pro- 
cessing and image-processing technologies for the purpose of making 
an up-to-date assessment of the state-of-the-art of machine analysis 
of remote sensing data. 

2. To determine the nature of the key research problems remaining as 
barriers to broader and more effective use of machine analysis of 
remote sensing data. 

3. To produce a report for use by interested researchers and potential
research sponsors detailing the findings and recommendations of the 
workshop participants. 


To achieve these objectives, invitations to participate in the
workshop were extended to several well-established scientists and
engineers in the field from universities, research institutions, and
government. The workshop also was publicized in the widely mailed
preliminary program of the Machine Processing Symposium. Thirty-six
participants were on hand when the workshop was called to order. (See
Appendix 1.)

To establish a common point of departure for the meeting, the report
entitled "Basic Research Planning in Mathematical Pattern Recognition
and Image Analysis," by Jack Bryant and L.F. Guseman of Texas A&M
University[3], was mailed to those who registered in advance and was
distributed at the conference to all others who registered. The report
summarized the conclusions of a NASA-commissioned working group charged
with defining a fundamental research program in image processing for
remote sensing. As such, it provided a natural starting point for the
discussions planned for the workshop.

Sessions of the workshop (see Appendix 2 for Workshop Schedule) 
focused on: 

* Data Bases and Image Registration, including presentations on Data
Bases for Remote Sensing, Image Preprocessing Operations, and
Map-Oriented Considerations.

* Advanced Technology, including presentations on Advanced Digital 
Systems, and Artificial Intelligence Methods. 

* Information Extraction, including presentations on Classification,
and Classifier Training Considerations.

Each session had a reporter assigned to record and summarize key points
in the presentations and the associated discussion periods. (The
proceedings compiled by the session reporters may be found in Appendix 3
of this report.) The workshop ended with general comments from Mr. R.B.
MacDonald of NASA/JSC, representing the workshop cosponsor, concerning
the near and intermediate term outlook for support of fundamental
research in remote sensing.


SUMMARY OF WORKSHOP PROCEEDINGS 

With regard to data bases and image registration, it was surprising
to find a great deal of disagreement on the degree to which improved 
registration and rectification of data are required. There seemed to be 
a general consensus that research is needed in 

• Improved platform control and sensor modeling to reduce the need for 
rectification and registration. 

• Modeling atmospheric effects and the atmosphere point spread
function.

• Acquisition and utilization of digital terrain data.

• Understanding how to quantify the real needs of the user/application
for precision rectification and registration of the data and the
degree to which analysis results and user end products are affected
by errors in registration and rectification.

The areas of advanced technology which were considered seemed to be 
perceived as somewhat divergent with respect to their prospects for 
near-term applicability to remote sensing of renewable resources. The 
emergence of parallel processing systems, capitalizing on the shrinking 
size and cost of digital computers, was recognized as having great 
potential for amplifying the rate at which digital imagery can be
processed; some systems already exist to do this. General applicability
of this form of advanced digital technology may follow from successful
research in the direction of

• Memory architecture and management strategies for interfacing paral- 
lel processing systems and high-volume, high-dimensional remote 
sensing data. 

• Understanding the theoretical speedup limitations of parallel sys- 
tems and the concomitant implications for the cost versus benefit 
tradeoffs involving such systems. 

It was further recommended that

• A prototype parallel system using contemporary technology should be 
assembled to demonstrate the theoretical models and validate perfor- 
mance predictions. 

Artificial intelligence, the aim of which is to find ways to make 
computers perform tasks normally thought of as requiring human intelli- 
gence, could eventually lead to automation of the process of obtaining 
high-level information from pictorial data. Looking at the steps
conventionally followed in proceeding from a scene to a description of a
scene by way of remote sensing and computer processing, it was observed 
that artificial intelligence research could contribute to 

• Development of scene and season models which will allow reduction of
raw image data to a form free of incidental variations ("noise" of
various forms) without needing local ground truth or ancillary data.

• Development of scene models and analytical mechanisms which will
facilitate both the representation and manipulation of information
available from a scene (e.g., graph structures and
machine-implemented reasoning processes).

But there was some skepticism with respect to the near-term applica- 
bility of artificial intelligence research results in remote sensing. 
Some feel that a more fruitful approach would be to concentrate on faci- 
litating interaction of the human analyst with his data. Still, given 
that the potential payoff of success in the artificial intelligence 
domain is very great, near-term progress is hardly a fair criterion for 
prioritizing fundamental research needs.

In the information extraction sessions, a recurrent notion was
"mixtures." More specifically, two significant issues were how to deal
with mixtures of dissimilar data in multitype data bases, and how to resolve
ambiguities resulting from mixture pixels (often boundaries) in image 
data. The latter problem stands as a serious barrier to improved spec- 
tral classification accuracy and proportion estimation accuracy and is 
widely recognized as requiring concerted attention. The former repre- 
sents more an opportunity than a barrier, a source of information about 
the observed scene which the technology has only begun to exploit. Spe- 
cific research issues identified include 

* Quantifying the effects of mixture pixels on classification and pro- 
portion estimation accuracy; finding effective ways to resolve un- 
certainties arising from the presence of mixture pixels. 

* Development of more effective and efficient sampling techniques for 
classifier training, classifier evaluation, and area/proportion 
estimation. 

* Determining meaningful ways to evaluate and compare alternative 
methods for proportion estimation. 

* Development of effective formalisms for characterizing and differen- 
tiating among spatial patterns in complex scenes. 

* Development of statistical models and classification methods appli- 
cable to data sets with components from greatly different sources. 
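
The mixture-pixel issue in the first item above can be made concrete with
a minimal sketch. It assumes a linear mixing model (an assumption made
here for illustration, not something the workshop prescribed), in which a
boundary pixel's response in each band is a proportion-weighted sum of two
class signatures. The signature values and the function names are
hypothetical.

```python
def mix(sig_a, sig_b, p):
    """Linear mixture: fraction p of class A and (1 - p) of class B, band by band."""
    return [p * a + (1.0 - p) * b for a, b in zip(sig_a, sig_b)]


def estimate_proportion(pixel, sig_a, sig_b):
    """Least-squares estimate of the class-A fraction, clipped to [0, 1]."""
    diff = [a - b for a, b in zip(sig_a, sig_b)]
    denom = sum(d * d for d in diff)
    if denom == 0.0:
        raise ValueError("class signatures are identical")
    num = sum((x - b) * d for x, b, d in zip(pixel, sig_b, diff))
    return min(1.0, max(0.0, num / denom))


def nearest_class(pixel, signatures):
    """Hard per-pixel label: the class whose mean signature is closest."""
    def sq_dist(sig):
        return sum((x - s) ** 2 for x, s in zip(pixel, sig))
    return min(signatures, key=lambda name: sq_dist(signatures[name]))


# Hypothetical four-band mean signatures (illustrative numbers only).
CORN = [30.0, 25.0, 60.0, 70.0]
SOIL = [45.0, 50.0, 55.0, 50.0]

boundary_pixel = mix(CORN, SOIL, 0.4)  # 40% corn, 60% soil
label = nearest_class(boundary_pixel, {"corn": CORN, "soil": SOIL})
fraction = estimate_proportion(boundary_pixel, CORN, SOIL)
```

Under these assumptions the hard label assigns the whole pixel to one
class, while the proportion estimate recovers the mixing fraction; this is
the gap between classification accuracy and proportion estimation accuracy
that the research issues refer to.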

Appendix 3 contains a more detailed account of the individual pre- 
sentations and discussions comprising the workshop. 


CONCLUSIONS 

Overall, the panel of experts did not take issue strongly with any
aspect of the Bryant/Guseman report. Quite appropriately, however,
there was a strong tendency to focus sharply on basic understanding as
opposed to, say, algorithm development. Specifically, the discussions
highlighted the need for:

1. Understanding and modeling the physical phenomena which produce
deleterious aberrations in remote sensing image data.

2. Quantification of user needs for precision in image registration 
and rectification in order to understand the real value of these 
operations and impact of residual errors. 

3. Understanding the real potential of parallel computing systems
for improving the processing efficiency of large remote sensing
data sets.

4. Understanding how images capture useful information, how humans
extract that information through reasoning processes, and how
computers might emulate these processes.

5. Understanding the impact of mixture pixels on scene analysis 
results and exploration of new approaches for dealing effec- 
tively with them. 

6. Modeling relationships among diverse data sources and under- 
standing how useful information may be extracted from these 
relationships.


Appendix 2: WORKSHOP SCHEDULE


MONDAY, June 22, 1981
7:45 a.m. Registration

8:00 a.m. - 8:30 a.m. Opening remarks, charge to the attendees.

Philip H. Swain, Purdue University, Workshop
Chairman

8:30 a.m. - 11:30 a.m.

Session I: Data Bases and Image Registration

Chairman: David Simonett, University of California,
Santa Barbara
Reporter: Paul E. Anuta, Purdue University

DATA BASES FOR REMOTE SENSING. David Simonett
IMAGE PREPROCESSING OPERATIONS. Frederick C. Billingsley
MAP-ORIENTED CONSIDERATIONS. Robert McEwen


12:30 p.m. - 2:30 p.m.

Session II: Advanced Technology

Chairman: Azriel Rosenfeld, University of Maryland
Reporter: Philip H. Swain, Purdue University

ADVANCED DIGITAL SYSTEMS. Howard Jay Siegel
ARTIFICIAL INTELLIGENCE METHODS. Azriel Rosenfeld


3:00 p.m. - 5:30 p.m.

Session III: Information Extraction

Chairman: David A. Landgrebe, Purdue University
Reporter: Richard S. Latty, Technicolor Graphics,
Sunnyvale, CA

CLASSIFICATION. Philip H. Swain
CLASSIFIER TRAINING CONSIDERATIONS. R. Kent Lennington


7:30 p.m. - 10:00 p.m. Group discussions, report formulation.


TUESDAY, June 23, 1981

1:00 p.m. - 4:30 p.m. Discussion of draft reports.

4:30 p.m. - 5:15 p.m. Workshop Wrap-Up. Philip H. Swain

Assessment: THE WORKSHOP AND THE FUTURE.
Robert B. MacDonald, NASA


Appendix 3: Proceedings of the
WORKSHOP ON KEY ISSUES IN THE ANALYSIS
OF REMOTELY SENSED DATA

June 22-23, 1981 

SESSION I: DATA BASES AND IMAGE REGISTRATION 

Reporter: Mr. Paul E. Anuta


Introduction


The activities of participants in Session I consisted of three
overview papers and discussion in the morning session, a discussion on
the evening of the first day, and finally a review discussion on the
second day. The scope of topics discussed went beyond the title subjects
and covered most of the scope of the Bryant/Guseman document
(registration, rectification, radiometric correction, data structures,
and others).

This report is in three parts: (1) an overview of the formal
presentations, (2) an account of the discussions that took place, and
(3) a statement of the conclusions regarding key issues and changes or
additions to the Bryant/Guseman report.


Speaker Presentations 

1. DATA BASES FOR REMOTE SENSING — David Simonett

Dr. Simonett spoke on three questions related to data processing:
(1) the general question of rectification, (2) extension to a variety of
data sources, and (3) total systems for multiple data sources.

The basic question he posed was: To what extent is high-precision
rectification needed by users? He asked, "Is it better to strive for
high rectification accuracy or accuracy adequate for purposes at hand?"
citing the USDA example of tying Landsat results in with their
ground-based systems. He pointed out that many microprocessor systems
available to users have varying degrees of capability for doing
preprocessing. The Bryant/Guseman report stressed absolute accuracy, he
said, and he believes this is "largely irrelevant."

The point was also made that whole-frame processing is probably not
needed; only small areas are generally processed. "Why do precision
processing for all data?" he asked.

Dr. Simonett believes there is a fundamental indeterminacy in the
data which limits the ultimate accuracy. He asked what the incremental
improvement is which can be obtained by improving registration accuracy.
He suggested that this should be a research study.

Examples of registration were discussed: S-192 had interband
misregistration. Eastern Maryland area. Needed to know if 1/2 pixel
misregistration was a serious problem. In case of radar, could not use
topographic highs as controls. Only rivers and long linear features are
common to MSS and radar.

With regard to multiple data sources, nominal as well as ordinal
data must be considered. The most widely used projection is UTM.

In summary, he listed the following key items: 

* The question of how precisely rectification should be done. 

* Should we routinely rectify to UTM? 

* What are alternative strategies to the whole scene,
high-precision registration/rectification approach?


Then slides were shown to illustrate the issues: 

1. Regional Analysis for Geology

Illustrated that methods are becoming too automatic and that a great
deal can be obtained from manual interpretation aids.

2. Land Use Planning and Management 

Percentages of information obtained from remote sensing, given: 

60% certain land use parameters
30% landscape parameters
20% socio-economic information

3. Land Use Data Base 

Can improved accuracy be obtained by better sampling rather than 
by improving registration? 

4. Scene and Subscene Statistics (India)

Are global corrections to whole scene desired? 

Are global scene statistics valid? 

5. Area in Australia

6. Uncorrected data

7. Linear stretch

8. Areas predicted as geologically similar

Fine lines in scene. 

Misregistration would be very serious. 


9. Felsic Volcanics

Serious problem of any misregistration. 


10. Question of whether multidate is useful. 


11. W. Australia 
Two dates. 

Would precise registration be of benefit? 


12. San Francisco Peninsula 

Change in broad areas of interest. 

Would high registration accuracy be needed? 


13. W. Australia 

Enhancements vs. multidate.

Ratios, principal components, and Band 4/PC1.

Abundant geological information. 

"We need to define the degree to which problems presented in the
Bryant/Guseman report are crucial to success of remote sensing goals,"
Dr. Simonett said.


2. IMAGE PROCESSING OPERATIONS — Frederick C. Billingsley 

Dr. Billingsley discussed preprocessing problems generally, follow- 
ing the Bryant/Guseman report. He presented a set of slides and over- 
heads containing key items for consideration and also some interesting 
research results. The overheads are reproduced on the following pages. 
Three slides were presented first, containing results of studies on 
effects of noise and misregistration on classification accuracy. It was 
pointed out that misregistration causes significant loss of accuracy for 
crop fields of typical midwest size. 
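
A rough geometric sketch suggests why field size matters here. It
assumes (as an illustration only, not a result from the studies shown)
that any pixel lying within the misregistration distance of a field edge
may draw part of its measurement vector from the neighboring field. The
function name and the field dimensions are hypothetical, and all
quantities are in pixel units.

```python
import math


def contaminated_fraction(width_px, height_px, shift_px):
    """Fraction of a rectangular field's pixels within ceil(shift) pixels of its edge."""
    margin = math.ceil(shift_px)
    inner_w = max(0, width_px - 2 * margin)
    inner_h = max(0, height_px - 2 * margin)
    return 1.0 - (inner_w * inner_h) / (width_px * height_px)


# A one-pixel misregistration touches a much larger share of a small field.
small_field = contaminated_fraction(10, 10, 1.0)
large_field = contaminated_fraction(40, 40, 1.0)
```

In this simplified model a 10 x 10 pixel field has about a third of its
pixels at risk, against roughly a tenth for a 40 x 40 pixel field, so the
same registration error costs small fields proportionally more.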


FUNDAMENTAL RESEARCH WORKSHOP
SPATIAL FREQUENCY ISSUES


INSTRUMENT PSF GENERALLY UNDERSTOOD, BUT CORRECTIONS
NEED DISSEMINATION


ATMOSPHERE PSF UNKNOWN 


INTERPOLATION FREQUENCY RESPONSES UNDERSTOOD BUT 
NEED REITERATION


FUNDAMENTAL RESEARCH WORKSHOP
ERROR ESTIMATIONS

DETERMINE/DEFINE ERROR MEASURES

GROUND LOCATION ACCURACY MAPS 

TEMPORAL OVERLAY ACCURACY 

RADIOMETRIC ERROR BUDGET 

RADIOMETRIC MOSAIC SEAM PROBLEM 

RELATIVE VALUES/DIFFICULTIES OF ABSOLUTE


ANCILLARY SENSING


FUNDAMENTAL RESEARCH WORKSHOP
IMPLICATIONS TO SENSOR SYSTEMS 


FUTURE STRESS ON EASE OF DATA PROCESSING, GREATER
INSTRUMENT ACCURACY

SENSOR GROUPING WILL LEAD TO PLATFORM APPROACH - 
GROUPING ON ORBIT, TIME OF DAY 

GEOGRAPHIC REGISTRATION REQUIREMENTS LEAD TO NEED FOR 
BETTER POSITIONING AND POINTING ACCURACY 

TIMELINESS AND DIRECT RECEPTION LEAD TO DESIRABILITY OF
SOME ON-BOARD PROCESSING

INCREASED RESOLUTION, PARAMETERS MEASURED WILL LEAD TO 
NEED FOR DATA COMPRESSION/AVOIDANCE - POINTABLE SENSORS 
WITH SELECTABLE RESOLUTION 

GROUND PROCESSING CAPABILITIES MUST INCREASE - USE OF 
SPECIAL PURPOSE AND PARALLEL PROCESSING INDICATED


3. MAP-ORIENTED CONSIDERATIONS — Robert McEwen

Dr. McEwen opened with a slide of a French news article of 1074
stating that photographic surveys from balloons were impractical due to
high cost. He then turned to the Cartography vs. Remote Sensing
dichotomy:

There is a feeling towards not using cartographic terms. USGS could 
not call the Florida mosaic a map. What do cartographers do? They
endeavor to do categorization and spend a great deal of time deciding 
what should be mapped. Categories must have meaning to someone. 

Registration — in the domain of one sensor. If we move outside to 
other sensors, then we need to do rectification. 

Digital Cartography — (This is not automatic cartography.) The ability
to convert map data into a computer-readable environment is a key
requirement. How do you tell a computer what is next to what?
Topologic relationships are important. It will be possible to combine
remote sensing and map information in the computer to form a powerful
data base. Example: Digital Terrain: USGS digitizing 7-1/2 minute
quadrangles. Maps have 7m vertical accuracy, 30m pixel spacing. These
data can help in Lambertian considerations. Other planimetric categories
are being digitized, e.g., public land boundaries, stream courses. Logic
is needed for closing polygons. "The Digital Cartographic Revolution
will have a profound impact on remote sensing."

Models — Geometric models lousy. Always operationally changing our
modus operandi. Many models studied.

Issues — Settle down to doing business certain ways. Have to deal with 
a lot of ground control points. How many are needed? 12 - 180 to check 
accuracy. NASA uses 40 for current correction in the master data pro- 
cessor. 
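
The role of ground control points can be illustrated with a small
sketch. It assumes a simple six-parameter affine model relating image
(column, row) positions to map (east, north) coordinates, fitted by least
squares; this is a generic textbook approach chosen for illustration, not
the procedure used in the NASA master data processor. All function names
and coordinate values are hypothetical.

```python
def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("control points are collinear or too few")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= factor * m[col][c]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        s = m[r][3] - sum(m[r][c] * x[c] for c in range(r + 1, 3))
        x[r] = s / m[r][r]
    return x


def fit_affine(image_pts, map_pts):
    """Least-squares affine map (col, row) -> (east, north) from control points."""
    basis = [(c, r, 1.0) for c, r in image_pts]
    normal = [[sum(p[i] * p[j] for p in basis) for j in range(3)] for i in range(3)]
    coeffs = []
    for axis in (0, 1):
        rhs = [sum(p[i] * t[axis] for p, t in zip(basis, map_pts)) for i in range(3)]
        coeffs.append(solve3(normal, rhs))
    return coeffs  # [[a, b, c], [d, e, f]]


def apply_affine(coeffs, pt):
    """Map one image point to map coordinates with the fitted coefficients."""
    c, r = pt
    return tuple(k[0] * c + k[1] * r + k[2] for k in coeffs)


def rms_error(coeffs, image_pts, map_pts):
    """Root-mean-square positional residual over a set of points."""
    total = 0.0
    for pt, (e, n) in zip(image_pts, map_pts):
        pe, pn = apply_affine(coeffs, pt)
        total += (pe - e) ** 2 + (pn - n) ** 2
    return (total / len(image_pts)) ** 0.5


# Synthetic control points generated from a known (made-up) transform.
gcp_image = [(0, 0), (100, 0), (0, 100), (100, 100), (50, 20)]
gcp_map = [(30.0 * c + 2.0 * r + 1000.0, 1.0 * c - 30.0 * r + 5000.0)
           for c, r in gcp_image]
model = fit_affine(gcp_image, gcp_map)
residual = rms_error(model, gcp_image, gcp_map)
```

Three non-collinear points determine the affine model; additional points
only help to average out measurement error and to check accuracy. In
practice the residual would be evaluated on independent check points
rather than on the points used for the fit, which is one way to read the
question of how many control points are needed.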

Photogrammetry — Put photos together in blocks rather than try to get
control for each photo. Blocks joined mathematically. Boundary and 
cantilever problems. Better to determine and control the attitude of 
platform. Inertial systems are getting good enough to eliminate need 
for control. Photogrammetric surveys without control are possible. 

Rule of thumb: 10-to-1 ratio of total error to individual error. 

Issue of Design — What is user perception of output of system? Need to
do more in design of the output. 

Publication — We are moving into electronic media. Instant throwaway 
maps will become more common.


Overview Discussion

* It was stated that renewable resources applications do need carto- 
graphic accuracy, so this was a relevant subject for research. 

* On the Registration/Rectification Dichotomy: The difficulty in
achieving given accuracies needs to be established. Provide new
materials to the user. Let the user determine the need and cost to
achieve a given level of registration/rectification accuracy.

* Dr. Haralick suggested an intermediate product with control. It was
pointed out to him that this is the format of the EDO tape. Dr.
Landgrebe pointed out that we need to worry more about the basic
research questions and not about procedural questions.

• Dr. Haralick also questioned the need of subpixel registration,
asking: "Why not find classifiers which can perform well under
misregistration?"

• We need to determine the cause of bias in extending class decision 
to field boundaries. 

• We need to look at historical boundary location to predict current 
locations. 


Key Issues From Data Bases and Image Registration 

• Evaluation of the benefit of improved registration/rectification 
must be done. 

• Investigate other strategies for registration/rectification rather 
than total processing. 

• Must include spatial frequency effects analysis in with previously 
cited radiometric and geometric considerations, and also include 
atmospheric point-spread function. 

• Must continue to pursue better platform control and modeling to eli- 
minate need for registration/rectification. 

• Must address the issue of what are the desirable cartographic
products from remote sensors.


SESSION II: ADVANCED TECHNOLOGY

Reporter: Dr. Philip H. Swain


Introduction 

The purpose of the Advanced Technology session was to assess the 
potential roles which selected subfields of computer science might play 
if a focused effort were made to apply them to remote sensing of renewa- 
ble resources. The formal speakers, both of whom are very well known in 
their own areas of expertise and well acquainted with the general remote 
sensing problem, discussed the application of advanced digital systems 
and methods of artificial intelligence. 


Speaker Presentations 

1. ADVANCED DIGITAL SYSTEMS — Howard Jay Siegel

Dr. Siegel introduced the basic concepts and terminology necessary
to discuss parallel computing methods and systems and gave several
examples of parallel algorithms applicable to remote sensing image
processing. The examples included maximum likelihood classification,
image smoothing, histogramming, and two-dimensional Fast Fourier
Transforms. He pointed out that some parallel machines have already
appeared on which these algorithms can be (in some cases, have been)
implemented. Familiar examples include STARAN and Illiac IV. These
computing systems were developed to do general parallel processing
tasks. Relatively little research has been done to exploit the potential
of parallel processing specifically for multivariate image processing.
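
Per-pixel maximum likelihood classification suits parallel hardware
because each pixel is labeled independently of every other pixel. The
sketch below illustrates that property only; it runs serially, and the
loop over strips stands in for separate processing elements. It assumes
Gaussian class models with diagonal covariance for brevity (a
simplification introduced here), and the class statistics, pixel values,
and function names are hypothetical.

```python
import math


def log_likelihood(pixel, mean, var):
    """Gaussian log-likelihood with a diagonal covariance (a simplifying assumption)."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (x - m) ** 2 / v
        for x, m, v in zip(pixel, mean, var)
    )


def classify_pixel(pixel, classes):
    """Label one pixel with the class of highest likelihood."""
    return max(classes, key=lambda name: log_likelihood(pixel, *classes[name]))


def classify_strip(strip, classes):
    """Label every pixel in one strip; no pixel depends on another."""
    return [classify_pixel(p, classes) for p in strip]


def classify_partitioned(pixels, classes, n_workers):
    """Split pixels into contiguous strips, classify each, and merge the labels."""
    if not pixels:
        return []
    size = -(-len(pixels) // n_workers)  # ceiling division
    strips = [pixels[i:i + size] for i in range(0, len(pixels), size)]
    labels = []
    for strip in strips:  # each iteration could run on its own processing element
        labels.extend(classify_strip(strip, classes))
    return labels


# Hypothetical two-band class statistics: (mean vector, variance vector).
classes = {
    "wheat": ([40.0, 80.0], [16.0, 25.0]),
    "water": ([10.0, 5.0], [4.0, 4.0]),
}
pixels = [[38.0, 78.0], [12.0, 6.0], [41.0, 85.0], [9.0, 4.0], [11.0, 7.0]]
serial_labels = classify_strip(pixels, classes)
partitioned_labels = classify_partitioned(pixels, classes, 2)
```

Because the partitioned result matches the serial one regardless of how
the pixels are divided, the work can be spread across processing elements
with no communication between them apart from distributing the data and
collecting the labels.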

Dr. Siegel described some testbed systems being developed which 
could provide opportunities to assess this potential. Compared to 
existing systems, the embryonic systems will have considerable architec- 
tural flexibility and have far better facilities for program development 
and testing. 

From Dr. Siegel's presentation and the associated discussion, the
following research recommendations emerged:

1. Investigate and evaluate in quantitative terms the potential 
benefits which could be derived by applying parallel processing 
to remote sensing image processing tasks. 

2. Develop memory management strategies for interfacing parallel
processing systems and high-volume, high-dimensional remote
sensing data. Memory management and associated input/output
represent the most serious bottleneck in parallel processing of
large-scale remote sensing imagery.

3. Carry out a realistic cost/benefit study on the use of parallel
systems in remote sensing applications using (a) off-the-shelf
component characteristics, and (b) foreseeable digital computer
technology.

4. Develop methods for predicting theoretical speedup limitations
for parallel implementations of remote sensing tasks.

5. Develop implementation procedures which facilitate optimal
implementations of remote sensing processing algorithms.

6. Develop a prototype single-instruction-stream
multiple-data-stream (SIMD) system (e.g., an array of
microprocessors) to demonstrate the implementation of theoretical
models and validate performance predictions.

These recommendations are in concert with, but somewhat more
specific than, the recommendations which appeared in the Bryant/Guseman
reference report.
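
One common first-order model for the theoretical speedup limitations
mentioned in recommendation 4 is Amdahl's law. It is offered here as an
illustrative assumption, not as the method the workshop had in mind: if a
fraction f of the work can be spread over n processors and the rest is
serial, the speedup is 1 / ((1 - f) + f / n). The function names are
hypothetical.

```python
def amdahl_speedup(parallel_fraction, n_processors):
    """Speedup predicted by Amdahl's law for a given parallelizable fraction."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must lie in [0, 1]")
    if n_processors < 1:
        raise ValueError("n_processors must be at least 1")
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / n_processors)


def speedup_limit(parallel_fraction):
    """Upper bound on speedup as the number of processors grows without limit."""
    serial = 1.0 - parallel_fraction
    return float("inf") if serial == 0.0 else 1.0 / serial


# If input/output keeps 10% of a task serial, 16 processors give about 6.4x,
# and no number of processors can exceed about 10x.
sixteen = amdahl_speedup(0.9, 16)
ceiling = speedup_limit(0.9)
```

Under this model the serial portion, which the session identified with
memory management and input/output, sets the ceiling on achievable
speedup, and that ceiling feeds directly into the cost/benefit study
called for in recommendation 3.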


2. METHODS OF ARTIFICIAL INTELLIGENCE — Azriel Rosenfeld 

Confessing to not being in the business specifically of making
things easier for remote sensing, Dr. Rosenfeld described his research
interest as making computers do intelligent things (not necessarily by
practical means). Given the volume of data to be analyzed and the
relatively slow, inconsistent and labor-intensive methods now widely
used for renewable resources applications of remote sensing, Rosenfeld's
interest is very relevant to the topic at hand. Fundamental research
along these lines may lead to important breakthroughs in both the
effectiveness and efficiency of remote sensing data analysis.

Rosenfeld's presentation and the ensuing discussions raised the
following research issues which are or could be addressed by the
methods/concepts/philosophies of artificial intelligence (refer to the
figure on the next page):

1. Develop scene-dependent models for image correction, to deter- 
mine and remove systematic distortion, which permit automatic 
determination of corrections from the image itself. 

2. Develop methods for deriving intrinsic images (images devoid of
incidental, non-information-bearing variations) independent of
ancillary data (e.g., remove effects of terrain relief from
spectral response data).

3. Develop region-level models for scene segmentation and
abstraction to derive symbolic images for manipulation and analysis.

4. Investigate the use of hierarchical graph structures (pyramids,
quadtrees, etc.) for representing scene information content.

5. Develop analytical methods for producing informative
descriptions from symbolic representations, e.g., reasoning processes
such as are embodied in "expert systems."
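
As an illustration of the hierarchical structures named in item 4, the
following is a minimal region quadtree sketch. It assumes a square grid of
class labels whose side is a power of two, and it represents a tree as
either a single label (a homogeneous block) or a list of four subtrees.
The grid contents and function names are hypothetical.

```python
def build_quadtree(grid, row=0, col=0, size=None):
    """Return a leaf label, or a list of four subtrees ordered NW, NE, SW, SE."""
    if size is None:
        size = len(grid)
    first = grid[row][col]
    if all(grid[r][c] == first
           for r in range(row, row + size)
           for c in range(col, col + size)):
        return first  # homogeneous block: store one label for the whole area
    half = size // 2
    return [
        build_quadtree(grid, row, col, half),
        build_quadtree(grid, row, col + half, half),
        build_quadtree(grid, row + half, col, half),
        build_quadtree(grid, row + half, col + half, half),
    ]


def count_leaves(tree):
    """Number of homogeneous blocks needed to represent the scene."""
    if isinstance(tree, list):
        return sum(count_leaves(child) for child in tree)
    return 1


# A 4 x 4 symbolic image: W = water, F = forest, C = crop.
scene = [
    ["W", "W", "F", "F"],
    ["W", "W", "F", "F"],
    ["W", "W", "F", "C"],
    ["W", "W", "C", "C"],
]
tree = build_quadtree(scene)
leaves = count_leaves(tree)
```

Large uniform regions collapse to single nodes, so only the quadrant
containing a class boundary is subdivided down to individual pixels. The
same idea underlies the compact scene representations the research issue
has in view.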


IMAGE ANALYSIS PARADIGM (Due to A. Rosenfeld)


In the discussion, Rosenfeld was asked whether, in the perspective
of the next 5 - 10 years, it really makes sense to put a lot of emphasis
on development of artificial intelligence methods as a substitute for
user/analyst interaction with the data and processing results. The
implication was that the artificial intelligence approach may be a very
long way from being developed to the level of practical application.
Rosenfeld pointed out, however, that some rather promising results are
already being achieved in applying these techniques to mineral
prospecting via remote sensing imagery and other uses of aerial
interpretation both here and abroad.


SESSION III: INFORMATION EXTRACTION (Object Scene Inference)

Reporter: Dr. Richard S. Latty

Introduction 


The purpose of this documentation is to augment and complement the
content of the Bryant/Guseman report directed at identifying,
formalizing, and prioritizing the current issues in digital image
analysis. The
manner in which this is conducted is through the compilation of material 
presented and the ensuing discussions in the area of object scene infer- 
ence. This material is organized by speakers since each represented a 
sufficiently distinct area and frame of reference pertaining to object 
scene inference. The background material presented is summarized to 
provide orientation. 


Speaker Presentations 

1. SYSTEMS CONCEPT OF INFORMATION EXTRACTION — Philip H. Swain 

Information extraction in remote sensing, from a total systems 
standpoint, is the process of transforming the actual physical scene 
into a body of information. This systems level process involves many 
related but distinct processes. These usually involve: distributions
of electromagnetic energy incident on the scene; object composition and
orientation relative to the source of illumination; the consequent
reflectance, or emittance; atmospheric modulation and attenuation;
photon reception; amplification (which may involve a transformation,
e.g., log transform); quantization; recording and/or telemetry;
satellite relay and/or ground reception; conversion; calibration
(band-to-band, within band); geometric and response level rectification
or adjustment; generation of a discriminant function or decision rule
(in a concrete form), or some other activity intended to convert some
representation of measured irradiance values into some "meaningful"
information; and higher order information generation through modeling or
data base integration and manipulation.

While the concern throughout this presentation is with the extrac- 
tion subprocess, the success attained in meeting the objectives of the 
analysis is dependent on the degree to which properties of interest are 
contained in the scene, and to what degree these properties are pre- 
served in the data. What is not intrinsically contained in the scene 
cannot be contained in the data obtained from the scene. Secondly, the 
desired information must be retained in the data from the scene. The 
measurable properties of the data must be consistently dependent on the 
properties of the scene which are of interest with respect to the objec- 
tives of the processing task. These properties may be measurable as 
patterns in the spectral domain (roultispectral data), the temporal 
domain (multidate data), the spatial domain (as in textural computations 
or spatial associations), or some combination of these and other 
domains. 
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The activity of information extraction must preaume that the desired 
information is contained In the properties of the scene, Is retained in 
the data, and Is of sufficient impact on the outcome of the decision- 
making process to warrant the Information extraction activity. These 
can be considered aS real-variable, continuous constraints on the war- 
ranted level of expenditure for the Information extraction activity. 
The remaining requisite for information extraction is to identify how 
the information is contained in the data; that is, what are the informa- 
tion-bearing characteristics (the "features”) of the imaged scene? Is 
the information represented in patterns in the spectral, temporal, spa- 
tial, or some other domain? Once the information-bearing features are 
identified, the problem becomes one of selecting a method of information 
extraction. Candidate procedures can be evaluated relative to an 
increasingly critical sequence of selection criteria. First, the candi- 
dates must be admissible; they must provide, to some degree, the desired 
information. Secondly, they must be feasible procedures; they must not 
require unavailable resources nor place unmanageable demands on the 
available resources. Lastly, the selected procedure would ideally be 
"optimal"; that is, the ratio of the value of the information extracted 
to the cost of the information obtained would be greatest of all candi- 
date techniques. 
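
The three criteria can be read as two filters followed by a ranking. The
following is a minimal sketch of that reading, not a procedure given at the
workshop: the candidate names, the value and cost figures, and the resource
budget are all hypothetical, and value and cost are assumed to be expressible
in comparable units.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    name: str
    provides_information: bool  # admissibility: yields the desired information at all
    cost: float                 # resources the procedure would consume (hypothetical units)
    value: float                # value of the information extracted (same units)


def select_procedure(candidates: List[Candidate], budget: float) -> Optional[Candidate]:
    """Apply the selection criteria in increasing order of strictness."""
    admissible = [c for c in candidates if c.provides_information]
    feasible = [c for c in admissible if 0 < c.cost <= budget]
    if not feasible:
        return None
    # "Optimal": greatest ratio of information value to cost.
    return max(feasible, key=lambda c: c.value / c.cost)


# Hypothetical candidates; the numbers are illustrative only.
candidates = [
    Candidate("per-point classifier", True, cost=2.0, value=6.0),
    Candidate("contextual classifier", True, cost=5.0, value=12.5),
    Candidate("full manual interpretation", True, cost=50.0, value=60.0),
    Candidate("unrelated transform", False, cost=1.0, value=0.0),
]
best = select_procedure(candidates, budget=10.0)
print(best.name if best else "no feasible procedure")
```

With these made-up figures the manual procedure is excluded as infeasible and
the unrelated transform as inadmissible, so the ranking is between the two
classifiers only.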

Candidate techniques for extracting the information from the data
include:


1. Various data transformation activities conducted to render the 
data more compatible with the particular extraction technique to 
be used. 

For change detection, this might involve band ratios. 

For multi-univariate classifiers, this might involve a principal 
components analysis, canonical analysis, or some other dimension 
reduction transformation. 

2. "Per-point" classification algorithms. 

Parallelepiped classifier.

Minimum Euclidean distance classifier. 

Gaussian maximum likelihood classifier (assumes a covariance
which varies with respect to cover class).

3. "Sample" classifiers.

Classification decision rules based on statistical distance mea- 
sures between multivariate Gaussian distributions. 

Employ either fixed or variable pixel grouping size; variable 
pixel grouping size requires image partitioning to precede the 
classification. 

4. Textural measures computed over a moving window.

Vary widely in specific form and computation, but can be categorized
as being either a specified computational transform (e.g., entropy,
second angular moment, ...), or least squares estimation of descriptive
coefficients (e.g., facet model, Fourier series, ...).

5. Algorithms which examine the neighborhood of pixels and employ 
measures of class frequency, or examine the relative evaluated 
probability density function associated with the neighborhood, 
fall in the category of contextual classifiers. 

6. Algorithms which arrange the decision logic in a sequential or 
decision tree approach provide a means of employing widely 
different forms of data in the information extraction process.
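
The three "per-point" classifiers listed under item 2 can be compared on a
single pixel. The sketch below is illustrative only: the class names, the
two-band means and covariances, and the box half-width k are hypothetical,
equal prior probabilities are assumed, and the matrix algebra is written out
for the two-band case so that no external library is needed.

```python
import math

# Hypothetical training statistics for two cover classes in two spectral bands.
# Each class has a mean vector and its own 2x2 covariance matrix.
CLASSES = {
    "wheat":  {"mean": (30.0, 60.0), "cov": ((4.0, 0.0), (0.0, 4.0))},
    "forest": {"mean": (40.0, 40.0), "cov": ((100.0, 0.0), (0.0, 100.0))},
}


def parallelepiped(pixel, k=2.0):
    """Return the classes whose box (mean +/- k std. dev. per band) contains the pixel."""
    hits = []
    for name, s in CLASSES.items():
        inside = all(
            abs(pixel[b] - s["mean"][b]) <= k * math.sqrt(s["cov"][b][b])
            for b in range(2)
        )
        if inside:
            hits.append(name)
    return hits


def min_distance(pixel):
    """Assign the pixel to the class with the nearest mean (Euclidean distance)."""
    return min(CLASSES, key=lambda n: math.dist(pixel, CLASSES[n]["mean"]))


def gaussian_ml(pixel):
    """Assign the pixel to the class with the largest Gaussian log-likelihood."""
    def log_likelihood(s):
        (a, b), (c, d) = s["cov"]
        det = a * d - b * c
        dx = pixel[0] - s["mean"][0]
        dy = pixel[1] - s["mean"][1]
        # Quadratic form (x - m)^T inverse(cov) (x - m) for a 2x2 matrix.
        maha = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
        # Constant terms common to all classes are dropped.
        return -0.5 * (math.log(det) + maha)
    return max(CLASSES, key=lambda n: log_likelihood(CLASSES[n]))


pixel = (34.0, 53.0)
print(parallelepiped(pixel), min_distance(pixel), gaussian_ml(pixel))
```

For this pixel the minimum distance rule chooses the nearer mean, while the
maximum likelihood rule chooses the class with the broader covariance. That
disagreement is the practical effect of letting the covariance vary with
cover class.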

Information extraction employing remotely sensed data does not cease 
after the process of classification but continues through the integra- 
tion with other data types. Through modeling and other inference-making 
processes, the information extraction process encompasses a much broader 
range of operations. 


Research Issues 


1. Accurate statistical models are needed for data from widely
different sources (different scanners, linear arrays, radar, passive
microwave, geophysical data, scatterometers, map bases, terrain data,
soils data, meteorological data, geodetic data, ...).

2. More flexible multistage analysis techniques are needed. This 
is dependent on a more thorough understanding of the relationships be- 
tween data type and information type in a multi-node or multi-level 
decision tree process approach to information extraction.

3. The need exists for a more thorough understanding of how to for- 
malize spatial relations in digital data and how these relations corres- 
pond to the desired information. 

4. How is mathematical formalization of spatial relations to be
assessed relative to other formalizations or computations in a manner
which is meaningful in terms of accuracy and precision in the
information extraction process?

5. A thorough understanding is needed of what determines the extent 
of the region for which spatial relations consistently represent infor- 
mation about the scene components. A means of determining the region 
size for different scenes also is needed. 


2. PROPORTION ESTIMATION — R. Kent Lennington


Proportion estimation is concerned with obtaining an estimate of the
relative area occupied by each constituent ground cover in a given
scene. Proportion estimations can be obtained by any of several analy- 
sis procedures embodied in the techniques employed by remote sensing. 
The task of training the classifier provides an estimate of the areas 
occupied by each cover class if the samples are selected at random, are
sufficiently numerous to provide stable estimates (i.e., small variances
associated with each estimated proportion), and each sample unit is
accurately identified. Problems arise in the implementation of the
training techniques due to: inabilities to correctly identify the sample
unit in terms of the classes contained in the scene, inabilities to
locate the sample unit in terms of the classes contained in the scene, 
inabilities to locate the sample unit accurately in the reference data 
(registration error); also the spatial extent of the sample unit may not 
be compatible with the spatial extent of some of the classes in the 
scene (error due to spatial resolution). Other problems arise due to 
the cost of acquiring a large sample size. 
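
The dependence of stability on sample size can be made concrete. The sketch
below assumes simple random sampling with error-free labels, so that each
estimated proportion has the binomial variance p(1 - p)/n; the class names
and counts are hypothetical, and none of the identification, registration,
or resolution errors described above are modeled.

```python
import math
from collections import Counter


def proportions_with_standard_errors(labels):
    """Estimate each class proportion and its standard error from a simple random sample."""
    n = len(labels)
    counts = Counter(labels)
    result = {}
    for cover, count in counts.items():
        p = count / n
        # Binomial variance of a sample proportion: p(1 - p) / n.
        result[cover] = (p, math.sqrt(p * (1.0 - p) / n))
    return result


# Hypothetical labeled sample units.
sample = ["wheat"] * 40 + ["barley"] * 10 + ["other"] * 50
estimates = proportions_with_standard_errors(sample)
for cover, (p, se) in sorted(estimates.items()):
    print(f"{cover}: proportion {p:.2f}, standard error {se:.3f}")
```

Because the standard error falls only as the square root of n, halving it
requires four times as many labeled sample units, which is the cost problem
noted above.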

While classification and proportion estimation place similar const- 
raints on the training procedure, the relationship between classifica- 
tion and proportion estimation is less clear. Classification provides a 
statistically biased proportion estimate due to the inequality of the 
errors of omission and commission, which arise from inequality of the 
spectral covariances between the classes in the scene. Furthermore, if 
classification is to be warranted in addition to the estimate obtained 
from the training procedure, the number of samples used in training the 
classifier should be relatively small. Hence the classification is 
either likely to be inaccurate or constitute an unwarranted additional 
expense. 

Attempts to reduce the bias introduced by classification have been 
made by using the classification to stratify the scene for the purpose 
of sampling effort allocation. The process is generally regarded as
wasteful and equally good estimates are provided by increasing the
initial sampling effort.

Other approaches employed in obtaining the proportion estimate
include employing a small and simple training sample. Estimating the
classifier and the omission/commission matrix is conducted through a
jackknife approach. The proportion estimates are then adjusted through
the estimated omission and commission frequencies. Clustering the
entire scene, to stratify the scene, also has been employed as a means
of sample effort allocation. These techniques have met with problems
similar to those encountered in the aforementioned approaches.
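
The adjustment step can be illustrated for two classes. If q_a is the
fraction of the scene classified as class a, then q_a = p_a P(a|a) +
(1 - p_a) P(a|b), where p_a is the true proportion; solving for p_a removes
the bias caused by unequal omission and commission errors. The sketch below
assumes the two conditional frequencies are already available (for example
from the jackknife step, which is not shown), and all numbers are
hypothetical.

```python
def corrected_proportion(q_a, p_a_given_a, p_a_given_b):
    """Invert q_a = p_a * P(a|a) + (1 - p_a) * P(a|b) for the true proportion p_a.

    q_a          -- fraction of the scene the classifier assigned to class a
    p_a_given_a  -- estimated probability that a true-a pixel is classified a
    p_a_given_b  -- estimated probability that a true-b pixel is classified a
    """
    denom = p_a_given_a - p_a_given_b
    if abs(denom) < 1e-12:
        raise ValueError("classifier carries no information about class a")
    p_a = (q_a - p_a_given_b) / denom
    return min(1.0, max(0.0, p_a))  # clip to a valid proportion


# Hypothetical scene: true proportion 0.30, 30% omission, 5% commission.
true_p = 0.30
q_a = true_p * 0.70 + (1.0 - true_p) * 0.05   # what the classifier reports
print(f"classified fraction {q_a:.3f}, "
      f"corrected {corrected_proportion(q_a, 0.70, 0.05):.3f}")
```

The correction is only as good as the estimated frequencies. When P(a|a) and
P(a|b) are close, small errors in either are strongly amplified by the
division.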

The need to accurately label the sample units is in direct conflict 
with the need for the sample units to be representative of the popula- 
tion of which the sample units are members. In order to be representa- 
tive, the sample must include atypical (in a spectral sense) as well as 
typical sample units. In order to achieve this representation, the
sampling should be objective (e.g., systematic random sample). Typical
samples are generally labeled with a high degree of accuracy. Atypical 
samples are labeled with lower accuracy. Methods of resolving this 
problem have employed attempts to estimate the error frequency associ- 
ated with labeling and adjust for this frequency in the classifier 
design or proportion estimate. 

Another, more promising, approach is to adopt a method wherein the 
sample includes typical and atypical sample units but only the typical
sample units are employed in the labeling process. A mixture model is
employed to formalize the problem and assist in designing the approach
employed to resolve the problem. 

The component densities of the mixture are assumed to be
multivariate-normal and hence each component density is distinguished
from other components in the mixture through a Gaussian, iterative
estimation technique employing a sequence of split/combine processes.
This provides an
objective means of associating the atypical and typical samples (pro- 
vided the multivariate-normal assumption is satisfied and that for every 
mode there occur typical samples). The proportion estimate can then be 
obtained through a summation of the components associated with each 
class in the scene. 
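
The final summation can be sketched in one dimension. The split/combine
procedure itself is not specified in the text, so plain
expectation-maximization with a fixed number of components is substituted
here as a stand-in. The data values, starting means, and component labels
are hypothetical; the labels stand for what an analyst would assign after
labeling typical sample units near each component mean.

```python
import math


def normal_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_mixture(values, means, iterations=50):
    """Fit a 1-D Gaussian mixture by EM, starting from the given component means."""
    k = len(means)
    means = list(means)
    variances = [1.0] * k
    weights = [1.0 / k] * k
    for _ in range(iterations):
        # E-step: responsibility of each component for each value.
        resp = []
        for x in values:
            dens = [w * normal_pdf(x, m, v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue
            weights[j] = nj / len(values)
            means[j] = sum(r[j] * x for r, x in zip(resp, values)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, values)) / nj
            variances[j] = max(var, 1e-3)  # floor guards against a collapsed component
    return weights, means


def class_proportions(weights, component_labels):
    """Sum the mixture weights of all components assigned to the same class."""
    totals = {}
    for w, label in zip(weights, component_labels):
        totals[label] = totals.get(label, 0.0) + w
    return totals


# Hypothetical single-band values: two wheat subclasses and one "other" class.
values = ([9.0, 9.5, 10.0, 10.5, 11.0] * 5
          + [13.5, 14.0, 14.5, 14.0, 14.0]
          + [19.0, 19.5, 20.0, 20.5, 21.0] * 4)
weights, means = fit_mixture(values, [10.0, 14.0, 20.0])
proportions = class_proportions(weights, ["wheat", "wheat", "other"])
print({label: round(p, 3) for label, p in proportions.items()})
```

Only the component labels require analyst input. Atypical sample units
contribute to the weights through the fitted densities without being labeled
individually, which is the point of the approach described above.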


Research Issues 


Implicit in the area of proportion estimation based on spectral data
are numerous problems associated with the relationship between patterns
in the spectral domain and each class and mixtures of classes contained
in the scene. Problems which arise pertaining to this relationship are:

1. What is the influence of boundary pixels on the accuracy of pro- 
portion estimation? 

2. How can boundary pixels best be accommodated in the context of 
proportion estimation? 

3. If area estimation is being conducted for crop production esti- 
mation, is there information germane to this objective in the evaluated 
discriminant functions or the relative location of "field-center" pixels
and boundary pixels in the spectral space? 

4. What types of information that may be available should be 
employed in systematizing the sampling allocation in order to obtain a 
more representative and effective sample? 

5. How are various proportion estimation techniques to be compared
in a meaningful and consistent manner?


3. SPATIAL PATTERNS — Robert M. Haralick

(Based on material submitted after the workshop)


The limitations encountered in information extraction from digital 
images through employment of per-point classifiers, or analysis 
approaches which examine the gray level value variation over small local 
regions, become more pronounced for more complex scenes and imagery 
obtained at higher spatial resolutions. Overcoming these limitations 
will depend on advances in the use of spatial patterns in the image.
To facilitate these efforts, an exact and comprehensive language of spa- 
tial patterns is needed. Only with such a language can the approach be 
formalized and structured into algorithms compatible with machine pro- 
cessing. 

Spatial patterns are a function of the organization of physical
objects in the three-dimensional spatial world and are rendered apparent 
through the presence of a spectral difference among the objects arranged 
in the 3-D space. These are the "ground spatial patterns." These spa- 
tial patterns are modified and transferred to the image, depending on 
the geometric relations between the source of illumination, the reflect- 
ing surfaces of the objects in the scene, the point and orientation at 
which each fraction of the scene is imaged, and the spatial extent of 
each fraction. The resulting spatial pattern is the "image spatial pat- 
tern." Therefore, an association exists between image spatial pattern 
and ground spatial pattern. Such an association could be formalized 
through the conditional probability:

P(I|G) = P(image spatial pattern | ground spatial pattern)

To actually implement this formalization in a decision rule, we need a
means of parametrically defining each component of the conditional
probability. The parametric representation is dependent on the language of
spatial patterns. This language is the means of describing the data 
structure which constitutes the image spatial pattern. To be compatible 
with the idea of parametric representation, there must be one data 
structure which serves as a good, or central, representation of each 
generic kind of spatial pattern. Generic kind connotes some level of 
dissimilarity from all other existing data structures. The level of 
dissimilarity needs to be measurable through some real number represen- 
tation which can be determined through a comparison of the representa- 
tive data structures. 
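
The requirement of a real-valued dissimilarity between representative data
structures can be illustrated with one simple choice of structure. In the
sketch below each image spatial pattern is represented by a normalized
gray-level co-occurrence matrix at a single pixel offset, and the
dissimilarity is the sum of absolute differences between two such matrices.
Both choices are assumptions made for illustration only; the text does not
prescribe either, and the tiny two-level images are hypothetical.

```python
def cooccurrence(image, levels, dx=1, dy=0):
    """Normalized co-occurrence frequencies of gray-level pairs at offset (dx, dy)."""
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    return [[v / total for v in row] for row in counts]


def dissimilarity(p, q):
    """Sum of absolute differences between two matrices; 0 means identical."""
    return sum(abs(a - b) for rp, rq in zip(p, q) for a, b in zip(rp, rq))


# Hypothetical two-level images: vertical and horizontal stripes.
stripes_v = [[0, 1, 0, 1], [0, 1, 0, 1], [0, 1, 0, 1], [0, 1, 0, 1]]
stripes_h = [[0, 0, 0, 0], [1, 1, 1, 1], [0, 0, 0, 0], [1, 1, 1, 1]]

p_v = cooccurrence(stripes_v, levels=2)
p_h = cooccurrence(stripes_h, levels=2)
print(dissimilarity(p_v, p_v), dissimilarity(p_v, p_h))
```

A single-offset co-occurrence matrix captures only one narrow aspect of a
spatial pattern, so this is far from the comprehensive language called for
above. It does show the form the requirement takes: one representative data
structure per generic pattern and a real number measuring the separation
between any two of them.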

Research Issues 

1. There is a need then to evolve a concise language of spatial 
patterns. 


2. The appropriate data structures with which to represent each 
generic kind of spatial pattern need to be determined. 

3. A dissimilarity function is needed to represent the separability 
of these data structures. 


